Overview

Dataset info

Number of variables: 34
Number of observations: 119389
Missing cells: 129424 (3.2%)
Duplicate rows: 0 (0.0%)
Total size in memory: 100.8 MiB
Average record size in memory: 885.7 B
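
The missing-cell total can be cross-checked against the per-variable missing counts listed in the variable sections of this report. A minimal sketch, assuming the four variables below are the only ones with missing values (the counts are copied from this report; the dictionary name is illustrative):

```python
# Per-variable missing counts, copied from the variable sections of this report.
missing_by_variable = {
    "agent": 16340,
    "company": 112592,
    "country": 488,
    "children": 4,
}

n_variables = 34
n_observations = 119389

# Total missing cells and their share of all cells (variables x observations).
missing_cells = sum(missing_by_variable.values())
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(missing_cells, f"{missing_pct:.1f}%")
```

The percentage is taken over all cells in the table, not over rows, which is why a column that is 94.3% empty still yields a small overall figure.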

Variable types

NUM: 19
CAT: 12
BOOL: 2
DATE: 1

Reproduction info

Date of analysis: 2020-04-22 20:45:24.813272
Version: pandas-profiling v2.4.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Configuration: config.yaml

Warnings

adr has 1959 (1.6%) zeros [Zeros]
agent has 16340 (13.7%) missing values [Missing]
babies is highly skewed (γ1 = 24.64644163) [Skewed]
babies has 118472 (99.2%) zeros [Zeros]
booking_changes has 101314 (84.9%) zeros [Zeros]
children has 110795 (92.8%) zeros [Zeros]
company has 112592 (94.3%) missing values [Missing]
country has a high cardinality: 178 distinct values [Warning]
day_of_week has 18692 (15.7%) zeros [Zeros]
days_in_waiting_list has 115691 (96.9%) zeros [Zeros]
lead_time has 6345 (5.3%) zeros [Zeros]
previous_bookings_not_canceled is highly skewed (γ1 = 23.53970156) [Skewed]
previous_bookings_not_canceled has 115769 (97.0%) zeros [Zeros]
previous_cancellations is highly skewed (γ1 = 24.45794698) [Skewed]
previous_cancellations has 112905 (94.6%) zeros [Zeros]
required_car_parking_spaces has 111973 (93.8%) zeros [Zeros]
stays_in_week_nights has 7645 (6.4%) zeros [Zeros]
stays_in_weekend_nights has 51997 (43.6%) zeros [Zeros]
total_of_special_requests has 70317 (58.9%) zeros [Zeros]
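
The Skewed flags are consistent with a simple absolute-value cutoff on the skewness γ1. A hedged sketch: the threshold of 20 is an assumption that happens to fit the values in this report (adults, at γ1 ≈ 18.3, is not flagged, while the three flagged variables all exceed 23); it is not read from config.yaml, and the helper name `is_highly_skewed` is hypothetical.

```python
# Assumed cutoff; not taken from the report's config.yaml.
SKEWNESS_THRESHOLD = 20.0

# Skewness values copied from the "Descriptive statistics" blocks of this report.
skewness = {
    "adr": 1.017731447,
    "adults": 18.31774829,
    "babies": 24.64644163,
    "booking_changes": 6.000368154,
    "days_in_waiting_list": 11.94430274,
    "previous_bookings_not_canceled": 23.53970156,
    "previous_cancellations": 24.45794698,
}


def is_highly_skewed(g1, threshold=SKEWNESS_THRESHOLD):
    """Return True when the magnitude of the skewness exceeds the cutoff."""
    return abs(g1) > threshold


flagged = sorted(name for name, g1 in skewness.items() if is_highly_skewed(g1))
print(flagged)
```

Under that assumption the flagged set is exactly the three variables carrying a Skewed warning in the list above.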

Variables

adr
Real number (ℝ)

ZEROS
Distinct count8878
Unique (%)7.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean101.78674417241118
Minimum-6.38
Maximum510.0
Zeros1959
Zeros (%)1.6%
Memory size932.8 KiB
Mini histogram

Quantile statistics

Minimum-6.38
5-th percentile38.4
Q169.29
median94.56
Q3126
95-th percentile193.5
Maximum510
Range516.38
Interquartile range (IQR)56.71

Descriptive statistics

Standard deviation48.15355432
Coefficient of variation (CV)0.4730827645
Kurtosis2.131745226
Mean101.7867442
Median Absolute Deviation (MAD)36.32946907
Skewness1.017731447
Sum12152217.6
Variance2318.764794
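
Several of the adr figures are derived from others in the same block, which gives a quick consistency check. A small sketch using only numbers copied from this report (variable names are illustrative):

```python
# Values copied from the adr "Quantile statistics" and "Descriptive statistics" blocks.
minimum, maximum = -6.38, 510.0
q1, q3 = 69.29, 126.0
mean = 101.7867442
std = 48.15355432

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance

print(round(value_range, 2), round(iqr, 2), round(cv, 6), round(variance, 3))
```

The same relations hold for every numeric variable in the report, so a mismatch in any block would point to a transcription problem rather than a property of the data.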
Histogram
Histogram with fixed size bins (bins=10)
Histogram
Histogram with variable size bins (bins=[-6.3800e+00 -3.1900e+00 1.3000e-01 5.6250e+00 6.2000e+00 ... 2.9525e+02 3.1625e+02 3.4350e+02 3.9469e+02 5.1000e+02], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
62 3754 3.1%
 
75 2715 2.3%
 
90 2473 2.1%
 
65 2418 2.0%
 
0 1959 1.6%
 
80 1889 1.6%
 
95 1661 1.4%
 
120 1607 1.3%
 
100 1573 1.3%
 
85 1538 1.3%
 
Other values (8868) 97802 81.9%
 
ValueCountFrequency (%) 
-6.38 1 < 0.1%
 
0 1959 1.6%
 
0.26 1 < 0.1%
 
0.5 1 < 0.1%
 
1 15 < 0.1%
 
ValueCountFrequency (%) 
510 1 < 0.1%
 
508 1 < 0.1%
 
451.5 1 < 0.1%
 
450 1 < 0.1%
 
437 1 < 0.1%
 

adults
Real number (ℝ≥0)

Distinct count14
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.856402181105462
Minimum0
Maximum55
Zeros403
Zeros (%)0.3%
Memory size932.8 KiB
Mini histogram

Quantile statistics

Minimum0
5-th percentile1
Q12
median2
Q32
95-th percentile3
Maximum55
Range55
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.5792632757
Coefficient of variation (CV)0.3120354423
Kurtosis1352.105314
Mean1.856402181
Median Absolute Deviation (MAD)0.3428875877
Skewness18.31774829
Sum221634
Variance0.3355459426
Histogram
Histogram with fixed size bins (bins=10)
Histogram
Histogram with variable size bins (bins=[ 0. 0.5 1.5 2.5 3.5 4.5 55. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
2 89679 75.1%
 
1 23027 19.3%
 
3 6202 5.2%
 
0 403 0.3%
 
4 62 0.1%
 
26 5 < 0.1%
 
27 2 < 0.1%
 
20 2 < 0.1%
 
5 2 < 0.1%
 
55 1 < 0.1%
 
Other values (4) 4 < 0.1%
 
ValueCountFrequency (%) 
0 403 0.3%
 
1 23027 19.3%
 
2 89679 75.1%
 
3 6202 5.2%
 
4 62 0.1%
 
ValueCountFrequency (%) 
55 1 < 0.1%
 
50 1 < 0.1%
 
40 1 < 0.1%
 
27 2 < 0.1%
 
26 5 < 0.1%
 

agent
Real number (ℝ≥0)

MISSING
Distinct count334
Unique (%)0.3%
Missing16340
Missing (%)13.7%
Infinite0
Infinite (%)0.0%
Mean86.69410668711002
Minimum1.0
Maximum535.0
Zeros0
Zeros (%)0.0%
Memory size932.8 KiB
Mini histogram

Quantile statistics

Minimum1
5-th percentile1
Q19
median14
Q3229
95-th percentile250
Maximum535
Range534
Interquartile range (IQR)220

Descriptive statistics

Standard deviation110.7748408
Coefficient of variation (CV)1.277766679
Kurtosis-0.007212717535
Mean86.69410669
Median Absolute Deviation (MAD)97.04704534
Skewness1.089370905
Sum8933741
Variance12271.06534
Histogram
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
9 31961 26.8%
 
240 13922 11.7%
 
1 7191 6.0%
 
14 3640 3.0%
 
7 3539 3.0%
 
6 3290 2.8%
 
250 2870 2.4%
 
241 1721 1.4%
 
28 1666 1.4%
 
8 1514 1.3%
 
Other values (323) 31735 26.6%
 
(Missing) 16340 13.7%
 
ValueCountFrequency (%) 
1 7191 6.0%
 
2 162 0.1%
 
3 1336 1.1%
 
4 47 < 0.1%
 
5 330 0.3%
 
ValueCountFrequency (%) 
535 3 < 0.1%
 
531 68 0.1%
 
527 35 < 0.1%
 
526 10 < 0.1%
 
510 2 < 0.1%
 

arrival_date_day_of_month
Real number (ℝ≥0)

Distinct count31
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean15.798163984956737
Minimum1
Maximum31
Zeros0
Zeros (%)0.0%
Memory size932.8 KiB
Mini histogram

Quantile statistics

Minimum1
5-th percentile2
Q18
median16
Q323
95-th percentile30
Maximum31
Range30
Interquartile range (IQR)15

Descriptive statistics

Standard deviation8.78082586
Coefficient of variation (CV)0.5558130596
Kurtosis-1.187160324
Mean15.79816398
Median Absolute Deviation (MAD)7.578551325
Skewness-0.001983779709
Sum1886127
Variance77.10290278
Histogram
Histogram with fixed size bins (bins=10)
Histogram
Histogram with variable size bins (bins=[ 1. 1.5 2.5 4.5 5.5 ... 20.5 23.5 26.5 30.5 31. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
17 4406 3.7%
 
5 4317 3.6%
 
15 4196 3.5%
 
25 4159 3.5%
 
26 4147 3.5%
 
9 4096 3.4%
 
12 4087 3.4%
 
16 4078 3.4%
 
2 4055 3.4%
 
19 4052 3.4%
 
Other values (21) 77796 65.2%
 
ValueCountFrequency (%) 
1 3626 3.0%
 
2 4055 3.4%
 
3 3855 3.2%
 
4 3763 3.2%
 
5 4317 3.6%
 
ValueCountFrequency (%) 
31 2208 1.8%
 
30 3853 3.2%
 
29 3580 3.0%
 
28 3946 3.3%
 
27 3802 3.2%
 

arrival_date_month
Categorical

Distinct count12
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size932.8 KiB
ValueCountFrequency (%) 
August 13877 11.6%
 
July 12661 10.6%
 
May 11791 9.9%
 
October 11160 9.3%
 
April 11089 9.3%
 
June 10939 9.2%
 
September 10508 8.8%
 
March 9793 8.2%
 
February 8068 6.8%
 
November 6794 5.7%
 
Other values (2) 12709 10.6%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceFalse
Contains non-wordsFalse

Length

Max length9
Mean length5.903190411
Min length3

arrival_date_week_number
Real number (ℝ≥0)

Distinct count53
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean27.165291609779796
Minimum1
Maximum53
Zeros0
Zeros (%)0.0%
Memory size932.8 KiB
Mini histogram

Quantile statistics

Minimum1
5-th percentile5
Q116
median28
Q338
95-th percentile49
Maximum53
Range52
Interquartile range (IQR)22

Descriptive statistics

Standard deviation13.60513357
Coefficient of variation (CV)0.5008278123
Kurtosis-0.9860669654
Mean27.16529161
Median Absolute Deviation (MAD)11.5499026
Skewness-0.0100311295
Sum3243237
Variance185.0996594
Histogram
Histogram with fixed size bins (bins=10)
Histogram
Histogram with variable size bins (bins=[ 1. 1.5 3.5 6.5 12.5 ... 49.5 50.5 51.5 52.5 53. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
33 3580 3.0%
 
30 3087 2.6%
 
32 3045 2.6%
 
34 3040 2.5%
 
18 2926 2.5%
 
21 2854 2.4%
 
28 2853 2.4%
 
17 2805 2.3%
 
20 2785 2.3%
 
29 2763 2.3%
 
Other values (43) 89651 75.1%
 
ValueCountFrequency (%) 
1 1047 0.9%
 
2 1218 1.0%
 
3 1319 1.1%
 
4 1487 1.2%
 
5 1387 1.2%
 
ValueCountFrequency (%) 
53 1816 1.5%
 
52 1195 1.0%
 
51 933 0.8%
 
50 1505 1.3%
 
49 1782 1.5%
 

arrival_date_year
Categorical

Distinct count3
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size932.8 KiB
ValueCountFrequency (%) 
2016 56706 47.5%
 
2017 40687 34.1%
 
2015 21996 18.4%
 

Composition

Contains charsFalse
Contains digitsTrue
Contains whitespaceFalse
Contains non-wordsFalse

Length

Max length4
Mean length4
Min length4

assigned_room_type
Categorical

Distinct count12
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size932.8 KiB
ValueCountFrequency (%) 
A 74052 62.0%
 
D 25322 21.2%
 
E 7806 6.5%
 
F 3751 3.1%
 
G 2553 2.1%
 
C 2375 2.0%
 
B 2163 1.8%
 
H 712 0.6%
 
I 363 0.3%
 
K 279 0.2%
 
Other values (2) 13 < 0.1%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceFalse
Contains non-wordsFalse

Length

Max length1
Mean length1
Min length1

babies
Real number (ℝ≥0)

SKEWED
ZEROS
Distinct count5
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.007948806003903207
Minimum0
Maximum10
Zeros118472
Zeros (%)99.2%
Memory size932.8 KiB
Mini histogram

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile0
Maximum10
Range10
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.09743659665
Coefficient of variation (CV)12.25801669
Kurtosis1633.934639
Mean0.007948806004
Median Absolute Deviation (MAD)0.01577550603
Skewness24.64644163
Sum949
Variance0.009493890367
Histogram
Histogram with fixed size bins (bins=10)
Histogram
Histogram with variable size bins (bins=[ 0. 0.5 1.5 5.5 10. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0 118472 99.2%
 
1 900 0.8%
 
2 15 < 0.1%
 
10 1 < 0.1%
 
9 1 < 0.1%
 
ValueCountFrequency (%) 
0 118472 99.2%
 
1 900 0.8%
 
2 15 < 0.1%
 
9 1 < 0.1%
 
10 1 < 0.1%
 
ValueCountFrequency (%) 
10 1 < 0.1%
 
9 1 < 0.1%
 
2 15 < 0.1%
 
1 900 0.8%
 
0 118472 99.2%
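
Because babies has only five distinct values, its summary statistics can be recomputed directly from the frequency table. The sketch below does that with counts copied from this report. One observation, offered with some caution: the figure the report labels "Median Absolute Deviation (MAD)" matches the mean absolute deviation about the mean; a median-based deviation would be 0 here, since more than half of the values equal the median.

```python
# Frequency table for babies, copied from this report: value -> count.
counts = {0: 118472, 1: 900, 2: 15, 9: 1, 10: 1}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Mean absolute deviation about the mean.
mean_abs_dev = sum(abs(value - mean) * count for value, count in counts.items()) / n

# Sample variance (divides by n - 1).
variance = sum((value - mean) ** 2 * count for value, count in counts.items()) / (n - 1)

print(n, total, mean, mean_abs_dev, variance)
```

The recomputed count, sum, mean, deviation, and variance agree with the values printed in the babies block to the precision shown there.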
 

booking_changes
Real number (ℝ≥0)

ZEROS
Distinct count21
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.22111752338992705
Minimum0
Maximum21
Zeros101314
Zeros (%)84.9%
Memory size932.8 KiB
Mini histogram

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile1
Maximum21
Range21
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.6523044096
Coefficient of variation (CV)2.950034894
Kurtosis79.3951055
Mean0.2211175234
Median Absolute Deviation (MAD)0.3752824928
Skewness6.000368154
Sum26399
Variance0.4255010428
Histogram
Histogram with fixed size bins (bins=10)
Histogram
Histogram with variable size bins (bins=[ 0. 0.5 1.5 2.5 3.5 ... 5.5 6.5 8.5 15.5 21. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0 101314 84.9%
 
1 12700 10.6%
 
2 3805 3.2%
 
3 927 0.8%
 
4 376 0.3%
 
5 118 0.1%
 
6 63 0.1%
 
7 31 < 0.1%
 
8 17 < 0.1%
 
9 8 < 0.1%
 
Other values (11) 30 < 0.1%
 
ValueCountFrequency (%) 
0 101314 84.9%
 
1 12700 10.6%
 
2 3805 3.2%
 
3 927 0.8%
 
4 376 0.3%
 
ValueCountFrequency (%) 
21 1 < 0.1%
 
20 1 < 0.1%
 
18 1 < 0.1%
 
17 2 < 0.1%
 
16 2 < 0.1%
 

children
Real number (ℝ≥0)

ZEROS
Distinct count6
Unique (%)< 0.1%
Missing4
Missing (%)< 0.1%
Infinite0
Infinite (%)0.0%
Mean0.10389077354776563
Minimum0.0
Maximum10.0
Zeros110795
Zeros (%)92.8%
Memory size932.8 KiB
Mini histogram

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile1
Maximum10
Range10
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.3985630006
Coefficient of variation (CV)3.836365704
Kurtosis18.67349954
Mean0.1038907735
Median Absolute Deviation (MAD)0.192831231
Skewness4.112569428
Sum12403
Variance0.1588524655
Histogram
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
0 110795 92.8%
 
1 4861 4.1%
 
2 3652 3.1%
 
3 76 0.1%
 
10 1 < 0.1%
 
(Missing) 4 < 0.1%
 
ValueCountFrequency (%) 
0 110795 92.8%
 
1 4861 4.1%
 
2 3652 3.1%
 
3 76 0.1%
 
10 1 < 0.1%
 
ValueCountFrequency (%) 
10 1 < 0.1%
 
3 76 0.1%
 
2 3652 3.1%
 
1 4861 4.1%
 
0 110795 92.8%
 

company
Real number (ℝ≥0)

MISSING
Distinct count353
Unique (%)0.3%
Missing112592
Missing (%)94.3%
Infinite0
Infinite (%)0.0%
Mean189.26673532440782
Minimum6.0
Maximum543.0
Zeros0
Zeros (%)0.0%
Memory size932.8 KiB
Mini histogram

Quantile statistics

Minimum6
5-th percentile40
Q162
median179
Q3270
95-th percentile435
Maximum543
Range537
Interquartile range (IQR)208

Descriptive statistics

Standard deviation131.6550146
Coefficient of variation (CV)0.6956056721
Kurtosis-0.4907952103
Mean189.2667353
Median Absolute Deviation (MAD)109.1110502
Skewness0.6015996673
Sum1286446
Variance17333.04288
Histogram
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
40 927 0.8%
 
223 784 0.7%
 
67 267 0.2%
 
45 250 0.2%
 
153 215 0.2%
 
174 149 0.1%
 
219 141 0.1%
 
281 138 0.1%
 
154 133 0.1%
 
405 119 0.1%
 
Other values (342) 3674 3.1%
 
(Missing) 112592 94.3%
 
ValueCountFrequency (%) 
6 1 < 0.1%
 
8 1 < 0.1%
 
9 37 < 0.1%
 
10 1 < 0.1%
 
11 1 < 0.1%
 
ValueCountFrequency (%) 
543 2 < 0.1%
 
541 1 < 0.1%
 
539 2 < 0.1%
 
534 2 < 0.1%
 
531 1 < 0.1%
 

country
Categorical

HIGH CARDINALITY
Distinct count178
Unique (%)0.1%
Missing488
Missing (%)0.4%
Memory size932.8 KiB
ValueCountFrequency (%) 
PRT 48589 40.7%
 
GBR 12129 10.2%
 
FRA 10415 8.7%
 
ESP 8568 7.2%
 
DEU 7287 6.1%
 
ITA 3766 3.2%
 
IRL 3375 2.8%
 
BEL 2342 2.0%
 
BRA 2224 1.9%
 
NLD 2104 1.8%
 
Other values (167) 18102 15.2%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceFalse
Contains non-wordsFalse

Length

Max length3
Mean length2.98928712
Min length2

customer_type
Categorical

Distinct count4
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size932.8 KiB
ValueCountFrequency (%) 
Transient 89612 75.1%
 
Transient-Party 25124 21.0%
 
Contract 4076 3.4%
 
Group 577 0.5%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceFalse
Contains non-wordsTrue

Length

Max length15
Mean length10.20915662
Min length5

day_of_week
Real number (ℝ≥0)

ZEROS
Distinct count7
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.885257435777165
Minimum0
Maximum6
Zeros18692
Zeros (%)15.7%
Memory size932.8 KiB
Mini histogram

Quantile statistics

Minimum0
5-th percentile0
Q11
median3
Q34
95-th percentile6
Maximum6
Range6
Interquartile range (IQR)3

Descriptive statistics

Standard deviation1.984294259
Coefficient of variation (CV)0.6877356019
Kurtosis-1.186485604
Mean2.885257436
Median Absolute Deviation (MAD)1.69793948
Skewness0.0777885192
Sum344468
Variance3.937423708
Histogram
Histogram with fixed size bins (bins=10)
Histogram
Histogram with variable size bins (bins=[0. 0.5 1.5 4.5 5.5 6. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0 18692 15.7%
 
4 18470 15.5%
 
3 18141 15.2%
 
2 17943 15.0%
 
1 16731 14.0%
 
6 16488 13.8%
 
5 12924 10.8%
 
ValueCountFrequency (%) 
0 18692 15.7%
 
1 16731 14.0%
 
2 17943 15.0%
 
3 18141 15.2%
 
4 18470 15.5%
 
ValueCountFrequency (%) 
6 16488 13.8%
 
5 12924 10.8%
 
4 18470 15.5%
 
3 18141 15.2%
 
2 17943 15.0%
 

days_in_waiting_list
Real number (ℝ≥0)

ZEROS
Distinct count128
Unique (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.3211686168742514
Minimum0
Maximum391
Zeros115691
Zeros (%)96.9%
Memory size932.8 KiB
Mini histogram

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile0
Maximum391
Range391
Interquartile range (IQR)0

Descriptive statistics

Standard deviation17.59479328
Coefficient of variation (CV)7.580144396
Kurtosis186.7914825
Mean2.321168617
Median Absolute Deviation (MAD)4.498836213
Skewness11.94430274
Sum277122
Variance309.5767507
Histogram
Histogram with fixed size bins (bins=10)
Histogram
Histogram with variable size bins (bins=[ 0. 0.5 2.5 3.5 4.5 ... 219. 223.5 247.5 385. 391. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0 115691 96.9%
 
39 227 0.2%
 
58 164 0.1%
 
44 141 0.1%
 
31 127 0.1%
 
35 96 0.1%
 
46 94 0.1%
 
69 89 0.1%
 
63 83 0.1%
 
50 80 0.1%
 
Other values (118) 2597 2.2%
 
ValueCountFrequency (%) 
0 115691 96.9%
 
1 12 < 0.1%
 
2 5 < 0.1%
 
3 59 < 0.1%
 
4 25 < 0.1%
 
ValueCountFrequency (%) 
391 45 < 0.1%
 
379 15 < 0.1%
 
330 15 < 0.1%
 
259 10 < 0.1%
 
236 35 < 0.1%
 

deposit_type
Categorical

Distinct count3
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size932.8 KiB
ValueCountFrequency (%) 
No Deposit 104641 87.6%
 
Non Refund 14586 12.2%
 
Refundable 162 0.1%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceTrue
Contains non-wordsTrue

Length

Max length10
Mean length10
Min length10

df_index
Real number (ℝ≥0)

UNIQUE
Distinct count119389
Unique (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean59694.59363928
Minimum0
Maximum119389
Zeros1
Zeros (%)< 0.1%
Memory size932.8 KiB
Mini histogram

Quantile statistics

Minimum0
5-th percentile5969.4
Q129847
median59695
Q389542
95-th percentile113419.6
Maximum119389
Range119389
Interquartile range (IQR)59695

Descriptive statistics

Standard deviation34465.19781
Coefficient of variation (CV)0.5773587809
Kurtosis-1.200011997
Mean59694.59364
Median Absolute Deviation (MAD)29847.65636
Skewness-7.865032929e-06
Sum7126877840
Variance1187849860
Histogram
Histogram with fixed size bins (bins=10)
Histogram
Histogram with variable size bins (bins=[ 0. 119389.], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
2047 1 < 0.1%
 
19747 1 < 0.1%
 
118035 1 < 0.1%
 
103704 1 < 0.1%
 
105753 1 < 0.1%
 
99610 1 < 0.1%
 
101659 1 < 0.1%
 
111900 1 < 0.1%
 
113949 1 < 0.1%
 
107806 1 < 0.1%
 
Other values (119379) 119379 > 99.9%
 
ValueCountFrequency (%) 
0 1 < 0.1%
 
1 1 < 0.1%
 
2 1 < 0.1%
 
3 1 < 0.1%
 
4 1 < 0.1%
 
ValueCountFrequency (%) 
119389 1 < 0.1%
 
119388 1 < 0.1%
 
119387 1 < 0.1%
 
119386 1 < 0.1%
 
119385 1 < 0.1%
 

distribution_channel
Categorical

Distinct count5
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size932.8 KiB
ValueCountFrequency (%) 
TA/TO 97869 82.0%
 
Direct 14645 12.3%
 
Corporate 6677 5.6%
 
GDS 193 0.2%
 
Undefined 5 < 0.1%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceFalse
Contains non-wordsTrue

Length

Max length9
Mean length5.343306335
Min length3

hotel
Categorical

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size932.8 KiB
ValueCountFrequency (%) 
City Hotel 79329 66.4%
 
Resort Hotel 40060 33.6%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceTrue
Contains non-wordsTrue

Length

Max length12
Mean length10.6710836
Min length10

is_canceled
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size932.8 KiB
ValueCountFrequency (%) 
0 75166 63.0%
 
1 44223 37.0%
 

is_repeated_guest
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size932.8 KiB
ValueCountFrequency (%) 
0 115579 96.8%
 
1 3810 3.2%
 

lead_time
Real number (ℝ≥0)

ZEROS
Distinct count479
Unique (%)0.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean104.01199440484467
Minimum0
Maximum737
Zeros6345
Zeros (%)5.3%
Memory size932.8 KiB
Mini histogram

Quantile statistics

Minimum0
5-th percentile0
Q118
median69
Q3160
95-th percentile320
Maximum737
Range737
Interquartile range (IQR)142

Descriptive statistics

Standard deviation106.8633579
Coefficient of variation (CV)1.027413795
Kurtosis1.696411731
Mean104.0119944
Median Absolute Deviation (MAD)84.67224136
Skewness1.346537317
Sum12417888
Variance11419.77727
Histogram
Histogram with fixed size bins (bins=10)
Histogram
Histogram with variable size bins (bins=[0.000e+00 5.000e-01 1.500e+00 2.500e+00 4.500e+00 ... 6.065e+02 6.240e+02 6.275e+02 6.690e+02 7.370e+02], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0 6345 5.3%
 
1 3460 2.9%
 
2 2069 1.7%
 
3 1816 1.5%
 
4 1715 1.4%
 
5 1565 1.3%
 
6 1445 1.2%
 
7 1331 1.1%
 
8 1138 1.0%
 
12 1079 0.9%
 
Other values (469) 97426 81.6%
 
ValueCountFrequency (%) 
0 6345 5.3%
 
1 3460 2.9%
 
2 2069 1.7%
 
3 1816 1.5%
 
4 1715 1.4%
 
ValueCountFrequency (%) 
737 1 < 0.1%
 
709 1 < 0.1%
 
629 17 < 0.1%
 
626 30 < 0.1%
 
622 17 < 0.1%
 

market_segment
Categorical

Distinct count8
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size932.8 KiB
ValueCountFrequency (%) 
Online TA 56477 47.3%
 
Offline TA/TO 24218 20.3%
 
Groups 19811 16.6%
 
Direct 12606 10.6%
 
Corporate 5295 4.4%
 
Complementary 743 0.6%
 
Aviation 237 0.2%
 
Undefined 2 < 0.1%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceTrue
Contains non-wordsTrue

Length

Max length13
Mean length9.019733811
Min length6

meal
Categorical

Distinct count5
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size932.8 KiB
ValueCountFrequency (%) 
BB 92309 77.3%
 
HB 14463 12.1%
 
SC 10650 8.9%
 
Undefined 1169 1.0%
 
FB 798 0.7%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceFalse
Contains non-wordsFalse

Length

Max length9
Mean length2.068540653
Min length2

previous_bookings_not_canceled
Real number (ℝ≥0)

SKEWED
ZEROS
Distinct count73
Unique (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.13709805760999758
Minimum0
Maximum72
Zeros115769
Zeros (%)97.0%
Memory size932.8 KiB
Mini histogram

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile0
Maximum72
Range72
Interquartile range (IQR)0

Descriptive statistics

Standard deviation1.497443066
Coefficient of variation (CV)10.92242365
Kurtosis767.2387944
Mean0.1370980576
Median Absolute Deviation (MAD)0.2658822007
Skewness23.53970156
Sum16368
Variance2.242335737
Histogram
Histogram with fixed size bins (bins=10)
Histogram
Histogram with variable size bins (bins=[ 0. 0.5 1.5 2.5 3.5 ... 10.5 14.5 25.5 30.5 72. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0 115769 97.0%
 
1 1542 1.3%
 
2 580 0.5%
 
3 333 0.3%
 
4 229 0.2%
 
5 181 0.2%
 
6 115 0.1%
 
7 88 0.1%
 
8 70 0.1%
 
9 60 0.1%
 
Other values (63) 422 0.4%
 
ValueCountFrequency (%) 
0 115769 97.0%
 
1 1542 1.3%
 
2 580 0.5%
 
3 333 0.3%
 
4 229 0.2%
 
ValueCountFrequency (%) 
72 1 < 0.1%
 
71 1 < 0.1%
 
70 1 < 0.1%
 
69 1 < 0.1%
 
68 1 < 0.1%
 

previous_cancellations
Real number (ℝ≥0)

SKEWED
ZEROS
Distinct count15
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.08711857876353768
Minimum0
Maximum26
Zeros112905
Zeros (%)94.6%
Memory size932.8 KiB
Mini histogram

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile1
Maximum26
Range26
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.8443398826
Coefficient of variation (CV)9.691846384
Kurtosis674.0680579
Mean0.08711857876
Median Absolute Deviation (MAD)0.1647743617
Skewness24.45794698
Sum10401
Variance0.7129098374
Histogram
Histogram with fixed size bins (bins=10)
Histogram
Histogram with variable size bins (bins=[ 0. 0.5 1.5 2.5 3.5 5.5 20. 22.5 25.5 26. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0 112905 94.6%
 
1 6051 5.1%
 
2 116 0.1%
 
3 65 0.1%
 
24 48 < 0.1%
 
11 35 < 0.1%
 
4 31 < 0.1%
 
26 26 < 0.1%
 
25 25 < 0.1%
 
6 22 < 0.1%
 
Other values (5) 65 0.1%
 
ValueCountFrequency (%) 
0 112905 94.6%
 
1 6051 5.1%
 
2 116 0.1%
 
3 65 0.1%
 
4 31 < 0.1%
 
ValueCountFrequency (%) 
26 26 < 0.1%
 
25 25 < 0.1%
 
24 48 < 0.1%
 
21 1 < 0.1%
 
19 19 < 0.1%
 

required_car_parking_spaces
Real number (ℝ≥0)

ZEROS
Distinct count5
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.06251832245851795
Minimum0
Maximum8
Zeros111973
Zeros (%)93.8%
Memory size932.8 KiB
Mini histogram

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile1
Maximum8
Range8
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.245292108
Coefficient of variation (CV)3.923523511
Kurtosis29.99778011
Mean0.06251832246
Median Absolute Deviation (MAD)0.1172698343
Skewness4.163212935
Sum7464
Variance0.06016821826
Histogram
Histogram with fixed size bins (bins=10)
Histogram
Histogram with variable size bins (bins=[0. 0.5 1.5 2.5 8. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0 111973 93.8%
 
1 7383 6.2%
 
2 28 < 0.1%
 
3 3 < 0.1%
 
8 2 < 0.1%
 
ValueCountFrequency (%) 
0 111973 93.8%
 
1 7383 6.2%
 
2 28 < 0.1%
 
3 3 < 0.1%
 
8 2 < 0.1%
 
ValueCountFrequency (%) 
8 2 < 0.1%
 
3 3 < 0.1%
 
2 28 < 0.1%
 
1 7383 6.2%
 
0 111973 93.8%
 

reservation_status
Categorical

Distinct count3
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size932.8 KiB
ValueCountFrequency (%) 
Check-Out 75166 63.0%
 
Canceled 43016 36.0%
 
No-Show 1207 1.0%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceFalse
Contains non-wordsTrue

Length

Max length9
Mean length8.619479181
Min length7

reservation_status_date
Date

Distinct count926
Unique (%)0.8%
Missing0
Missing (%)0.0%
Memory size932.8 KiB
Minimum2014-10-17 00:00:00
Maximum2017-09-14 00:00:00
Mini histogram

reserved_room_type
Categorical

Distinct count10
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size932.8 KiB
ValueCountFrequency (%) 
A 85993 72.0%
 
D 19201 16.1%
 
E 6535 5.5%
 
F 2897 2.4%
 
G 2094 1.8%
 
B 1118 0.9%
 
C 932 0.8%
 
H 601 0.5%
 
P 12 < 0.1%
 
L 6 < 0.1%
 

Composition

Contains charsTrue
Contains digitsFalse
Contains whitespaceFalse
Contains non-wordsFalse

Length

Max length1
Mean length1
Min length1

stays_in_week_nights
Real number (ℝ≥0)

ZEROS
Distinct count35
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.5003140992888793
Minimum0
Maximum50
Zeros7645
Zeros (%)6.4%
Memory size932.8 KiB
Mini histogram

Quantile statistics

Minimum0
5-th percentile0
Q11
median2
Q33
95-th percentile5
Maximum50
Range50
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.908288667
Coefficient of variation (CV)0.7632195761
Kurtosis24.28453022
Mean2.500314099
Median Absolute Deviation (MAD)1.364288191
Skewness2.862243798
Sum298510
Variance3.641565637
Histogram
Histogram with fixed size bins (bins=10)
Histogram
Histogram with variable size bins (bins=[ 0. 0.5 1.5 2.5 3.5 ... 18.5 20.5 21.5 25.5 50. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
2 33684 28.2%
 
1 30309 25.4%
 
3 22258 18.6%
 
5 11077 9.3%
 
4 9563 8.0%
 
0 7645 6.4%
 
6 1499 1.3%
 
10 1036 0.9%
 
7 1029 0.9%
 
8 656 0.5%
 
Other values (25) 633 0.5%
 
ValueCountFrequency (%) 
0 7645 6.4%
 
1 30309 25.4%
 
2 33684 28.2%
 
3 22258 18.6%
 
4 9563 8.0%
 
ValueCountFrequency (%) 
50 1 < 0.1%
 
42 1 < 0.1%
 
41 1 < 0.1%
 
40 2 < 0.1%
 
35 1 < 0.1%
 

stays_in_weekend_nights
Real number (ℝ≥0)

ZEROS
Distinct count17
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.9276063958991196
Minimum0
Maximum19
Zeros51997
Zeros (%)43.6%
Memory size932.8 KiB
Mini histogram

Quantile statistics

Minimum0
5-th percentile0
Q10
median1
Q32
95-th percentile2
Maximum19
Range19
Interquartile range (IQR)2

Descriptive statistics

Standard deviation0.9986140682
Coefficient of variation (CV)1.076549356
Kurtosis7.174078723
Mean0.9276063959
Median Absolute Deviation (MAD)0.8079931948
Skewness1.380039003
Sum110746
Variance0.9972300573
Histogram
Histogram with fixed size bins (bins=10)
Histogram
Histogram with variable size bins (bins=[ 0. 0.5 1.5 2.5 3.5 ... 6.5 7.5 8.5 11. 19. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0 51997 43.6%
 
2 33308 27.9%
 
1 30626 25.7%
 
4 1855 1.6%
 
3 1259 1.1%
 
6 153 0.1%
 
5 79 0.1%
 
8 60 0.1%
 
7 19 < 0.1%
 
9 11 < 0.1%
 
Other values (7) 22 < 0.1%
 
ValueCountFrequency (%) 
0 51997 43.6%
 
1 30626 25.7%
 
2 33308 27.9%
 
3 1259 1.1%
 
4 1855 1.6%
 
ValueCountFrequency (%) 
19 1 < 0.1%
 
18 1 < 0.1%
 
16 3 < 0.1%
 
14 2 < 0.1%
 
13 3 < 0.1%
 

total_of_special_requests
Real number (ℝ≥0)

ZEROS
Distinct count6
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.5713675464238749
Minimum0
Maximum5
Zeros70317
Zeros (%)58.9%
Memory size932.8 KiB
Mini histogram

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q31
95-th percentile2
Maximum5
Range5
Interquartile range (IQR)1

Descriptive statistics

Standard deviation0.7928000185
Coefficient of variation (CV)1.387548214
Kurtosis1.492531434
Mean0.5713675464
Median Absolute Deviation (MAD)0.673041097
Skewness1.349177557
Sum68215
Variance0.6285318694
Histogram
Histogram with fixed size bins (bins=10)
Histogram
Histogram with variable size bins (bins=[0. 1.5 2.5 3.5 4.5 5. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0 70317 58.9%
 
1 33226 27.8%
 
2 12969 10.9%
 
3 2497 2.1%
 
4 340 0.3%
 
5 40 < 0.1%
 
ValueCountFrequency (%) 
0 70317 58.9%
 
1 33226 27.8%
 
2 12969 10.9%
 
3 2497 2.1%
 
4 340 0.3%
 
ValueCountFrequency (%) 
5 40 < 0.1%
 
4 340 0.3%
 
3 2497 2.1%
 
2 12969 10.9%
 
1 33226 27.8%
 

Correlations

Missing values

Sample

First rows

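(index) | adr | adults | agent | arrival_date_day_of_month | arrival_date_month | arrival_date_week_number | arrival_date_year | assigned_room_type | babies | booking_changes | children | company | country | customer_type | day_of_week | days_in_waiting_list | deposit_type | df_index | distribution_channel | hotel | is_canceled | is_repeated_guest | lead_time | market_segment | meal | previous_bookings_not_canceled | previous_cancellations | required_car_parking_spaces | reservation_status | reservation_status_date | reserved_room_type | stays_in_week_nights | stays_in_weekend_nights | total_of_special_requests
0 | 0.0 | 2 | NaN | 1 | July | 27 | 2015 | C | 0 | 3 | 0.0 | NaN | PRT | Transient | 2 | 0 | No Deposit | 0 | Direct | Resort Hotel | 0 | 0 | 342 | Direct | BB | 0 | 0 | 0 | Check-Out | 2015-07-01 | C | 0 | 0 | 0
1 | 0.0 | 2 | NaN | 1 | July | 27 | 2015 | C | 0 | 4 | 0.0 | NaN | PRT | Transient | 2 | 0 | No Deposit | 1 | Direct | Resort Hotel | 0 | 0 | 737 | Direct | BB | 0 | 0 | 0 | Check-Out | 2015-07-01 | C | 0 | 0 | 0
2 | 75.0 | 1 | NaN | 1 | July | 27 | 2015 | C | 0 | 0 | 0.0 | NaN | GBR | Transient | 3 | 0 | No Deposit | 2 | Direct | Resort Hotel | 0 | 0 | 7 | Direct | BB | 0 | 0 | 0 | Check-Out | 2015-07-02 | A | 1 | 0 | 0
3 | 75.0 | 1 | 304.0 | 1 | July | 27 | 2015 | A | 0 | 0 | 0.0 | NaN | GBR | Transient | 3 | 0 | No Deposit | 3 | Corporate | Resort Hotel | 0 | 0 | 13 | Corporate | BB | 0 | 0 | 0 | Check-Out | 2015-07-02 | A | 1 | 0 | 0
4 | 98.0 | 2 | 240.0 | 1 | July | 27 | 2015 | A | 0 | 0 | 0.0 | NaN | GBR | Transient | 4 | 0 | No Deposit | 4 | TA/TO | Resort Hotel | 0 | 0 | 14 | Online TA | BB | 0 | 0 | 0 | Check-Out | 2015-07-03 | A | 2 | 0 | 1
5 | 98.0 | 2 | 240.0 | 1 | July | 27 | 2015 | A | 0 | 0 | 0.0 | NaN | GBR | Transient | 4 | 0 | No Deposit | 5 | TA/TO | Resort Hotel | 0 | 0 | 14 | Online TA | BB | 0 | 0 | 0 | Check-Out | 2015-07-03 | A | 2 | 0 | 1
6 | 107.0 | 2 | NaN | 1 | July | 27 | 2015 | C | 0 | 0 | 0.0 | NaN | PRT | Transient | 4 | 0 | No Deposit | 6 | Direct | Resort Hotel | 0 | 0 | 0 | Direct | BB | 0 | 0 | 0 | Check-Out | 2015-07-03 | C | 2 | 0 | 0
7 | 103.0 | 2 | 303.0 | 1 | July | 27 | 2015 | C | 0 | 0 | 0.0 | NaN | PRT | Transient | 4 | 0 | No Deposit | 7 | Direct | Resort Hotel | 0 | 0 | 9 | Direct | FB | 0 | 0 | 0 | Check-Out | 2015-07-03 | C | 2 | 0 | 1
8 | 82.0 | 2 | 240.0 | 1 | July | 27 | 2015 | A | 0 | 0 | 0.0 | NaN | PRT | Transient | 2 | 0 | No Deposit | 8 | TA/TO | Resort Hotel | 1 | 0 | 85 | Online TA | BB | 0 | 0 | 0 | Canceled | 2015-05-06 | A | 3 | 0 | 1
9 | 105.5 | 2 | 15.0 | 1 | July | 27 | 2015 | D | 0 | 0 | 0.0 | NaN | PRT | Transient | 2 | 0 | No Deposit | 9 | TA/TO | Resort Hotel | 1 | 0 | 75 | Offline TA/TO | HB | 0 | 0 | 0 | Canceled | 2015-04-22 | D | 3 | 0 | 0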

Last rows

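(index) | adr | adults | agent | arrival_date_day_of_month | arrival_date_month | arrival_date_week_number | arrival_date_year | assigned_room_type | babies | booking_changes | children | company | country | customer_type | day_of_week | days_in_waiting_list | deposit_type | df_index | distribution_channel | hotel | is_canceled | is_repeated_guest | lead_time | market_segment | meal | previous_bookings_not_canceled | previous_cancellations | required_car_parking_spaces | reservation_status | reservation_status_date | reserved_room_type | stays_in_week_nights | stays_in_weekend_nights | total_of_special_requests
119379 | 140.75 | 2 | 9.0 | 31 | August | 35 | 2017 | A | 0 | 0 | 0.0 | NaN | DEU | Transient | 0 | 0 | No Deposit | 119380 | TA/TO | City Hotel | 0 | 0 | 44 | Online TA | SC | 0 | 0 | 0 | Check-Out | 2017-09-04 | A | 3 | 1 | 1
119380 | 99.00 | 2 | 14.0 | 31 | August | 35 | 2017 | A | 0 | 0 | 0.0 | NaN | DEU | Transient | 1 | 0 | No Deposit | 119381 | Direct | City Hotel | 0 | 0 | 188 | Direct | BB | 0 | 0 | 0 | Check-Out | 2017-09-05 | A | 3 | 2 | 0
119381 | 209.00 | 3 | 7.0 | 30 | August | 35 | 2017 | G | 0 | 0 | 0.0 | NaN | JPN | Transient | 1 | 0 | No Deposit | 119382 | TA/TO | City Hotel | 0 | 0 | 135 | Online TA | BB | 0 | 0 | 0 | Check-Out | 2017-09-05 | G | 4 | 2 | 0
119382 | 87.60 | 2 | 42.0 | 31 | August | 35 | 2017 | A | 0 | 0 | 0.0 | NaN | DEU | Transient | 2 | 0 | No Deposit | 119383 | TA/TO | City Hotel | 0 | 0 | 164 | Offline TA/TO | BB | 0 | 0 | 0 | Check-Out | 2017-09-06 | A | 4 | 2 | 0
119383 | 96.14 | 2 | 394.0 | 30 | August | 35 | 2017 | A | 0 | 0 | 0.0 | NaN | BEL | Transient | 2 | 0 | No Deposit | 119384 | TA/TO | City Hotel | 0 | 0 | 21 | Offline TA/TO | BB | 0 | 0 | 0 | Check-Out | 2017-09-06 | A | 5 | 2 | 2
119384 | 96.14 | 2 | 394.0 | 30 | August | 35 | 2017 | A | 0 | 0 | 0.0 | NaN | BEL | Transient | 2 | 0 | No Deposit | 119385 | TA/TO | City Hotel | 0 | 0 | 23 | Offline TA/TO | BB | 0 | 0 | 0 | Check-Out | 2017-09-06 | A | 5 | 2 | 0
119385 | 225.43 | 3 | 9.0 | 31 | August | 35 | 2017 | E | 0 | 0 | 0.0 | NaN | FRA | Transient | 3 | 0 | No Deposit | 119386 | TA/TO | City Hotel | 0 | 0 | 102 | Online TA | BB | 0 | 0 | 0 | Check-Out | 2017-09-07 | E | 5 | 2 | 2
119386 | 157.71 | 2 | 9.0 | 31 | August | 35 | 2017 | D | 0 | 0 | 0.0 | NaN | DEU | Transient | 3 | 0 | No Deposit | 119387 | TA/TO | City Hotel | 0 | 0 | 34 | Online TA | BB | 0 | 0 | 0 | Check-Out | 2017-09-07 | D | 5 | 2 | 4
119387 | 104.40 | 2 | 89.0 | 31 | August | 35 | 2017 | A | 0 | 0 | 0.0 | NaN | GBR | Transient | 3 | 0 | No Deposit | 119388 | TA/TO | City Hotel | 0 | 0 | 109 | Online TA | BB | 0 | 0 | 0 | Check-Out | 2017-09-07 | A | 5 | 2 | 0
119388 | 151.20 | 2 | 9.0 | 29 | August | 35 | 2017 | A | 0 | 0 | 0.0 | NaN | DEU | Transient | 3 | 0 | No Deposit | 119389 | TA/TO | City Hotel | 0 | 0 | 205 | Online TA | HB | 0 | 0 | 0 | Check-Out | 2017-09-07 | A | 7 | 2 | 2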
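
In the sample rows, day_of_week appears to follow the weekday of reservation_status_date (with Monday = 0) rather than the arrival date. This is an inference from the twenty rows shown, not something the report states. A short sketch that checks it against the distinct (date, day_of_week) pairs in the first and last rows:

```python
from datetime import date

# Distinct (reservation_status_date, day_of_week) pairs from the sample rows.
samples = [
    (date(2015, 7, 1), 2),
    (date(2015, 7, 2), 3),
    (date(2015, 7, 3), 4),
    (date(2015, 5, 6), 2),
    (date(2015, 4, 22), 2),
    (date(2017, 9, 4), 0),
    (date(2017, 9, 5), 1),
    (date(2017, 9, 6), 2),
    (date(2017, 9, 7), 3),
]

# date.weekday() returns 0 for Monday through 6 for Sunday.
mismatches = [(d, dow) for d, dow in samples if d.weekday() != dow]
print(mismatches)
```

An empty list means every sampled pair agrees with that reading; it does not establish how the column was built for the remaining rows.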